$RCSfile: Memory.txt $ Description: Description of the Oberon-A memory management system Created by: fjc (Frank Copeland) $Revision: 1.2 $ $Author: fjc $ $Date: 1994/05/13 19:28:33 $ Copyright © 1994, Frank Copeland. This file is part of Oberon-A. See Oberon-A.doc for conditions of use and distribution. ________________________________________________________________________ Introduction ------------ This document describes the design and implementation of Oberon-A's memory management. This covers the allocator (NEW and SYSTEM.NEW), deallocator (SYSTEM.DISPOSE) and garbage collector (SYSTEM.GC). The design used for Oberon-A is based closely on the designs discussed in the Oberon Technical Notes 2 and 5, (see TechNotes.doc). Memory blocks ------------- Memory allocation in Oberon must take into account a number of factors that require more than just requesting a block of memory from the operating system. Oberon's run-time type checking requires that every record variable that is dynamically allocated have an hidden type tag preceding it. The garbage collector must be able to distinguish between memory allocated for records, arrays and untyped blocks (created by SYSTEM.NEW). CPointers and BPointers must be treated differently to normal Oberon pointers. Three types of memory block are recognised: RecordBlk : memory allocated to a POINTER TO variable. ArrayBlk : memory allocated to a POINTER TO variable. SysBlk : memory allocated to a CPointer or BPointer variable, or allocated with SYSTEM.NEW. A RecordBlk looks like this, where v is a variable of type POINTER TO T, T is a record type: RecordBlk = RECORD link : ADDRESS; (* next element in list of blocks *) tag : ADDRESS; (* type descriptor of type T *) v --> data : ARRAY SIZE(T) OF BYTE END An ArrayBlk looks like this, where v is a variable of type POINTER TO ARRAY n1, n2 ... ni-1 OF T: ArrayBlk = RECORD arrpos : LONGINT; (* used by the garbage collector *) elemSize : LONGINT; (* SIZE (T), for the convenience of the garbage collector *) size : LONGINT; (* n1 * n2 ... * ni-1 * SIZE(T) *) link : ADDRESS; (* next element in list of blocks *) tag : ADDRESS; (* type descriptor of type T *) v --> data : ARRAY size OF BYTE END A SysBlk looks like this, where v is any pointer variable: SysBlk = RECORD link : ADDRESS; (* next element in list of blocks *) size : LONGINT; (* size of block *) v --> data : ARRAY size OF BYTE END The type of the block is determined by bits encoded in the tag or size field at address: v - 4. Bits 0 and 1 are used for this encoding: they are available to be used because the type descriptors are guaranteed to be longword-aligned and the allocator ensures that block sizes are multiples of 4 bytes. Bit 0 is set to indicate that the block is a SysBlk. Bit 1 is set to indicate that the block is an ArrayBlk. If bit 1 is clear, the block is a RecordBlk. This encoding ensures that the type tag of a RecordBlk is a valid pointer to a type descriptor, which is necessary for fast run-time type checks and guards. For the other block types, the encoded bits must be masked out before the tag or size field can be used. The link field (at address: v - 8) in each type of block is used to implement a singly-linked list of allocated blocks. It is either NIL, or points to the link field in the next block in the list. Blocks are added to the list by the allocator and removed by the deallocator and garbage collector. All allocated blocks are freed when the program terminates. Two seperate lists are maintained: one of traced blocks known to the garbage collector and one of untraced blocks. The size, elemSize and arrpos fields in an ArrayBlk block are required by the garbage collector (see Garbage collection). Allocating memory ----------------- The allocator is implemented in two procedures in the run-time support library: OberonSys_NEW and OberonSys_SYSNEW. OberonSys_NEW is used to allocate RecordBlk and ArrayBlk blocks and OberonSys_SYSNEW is used to allocate SysBlk blocks. The standard procedure SYSTEM.NEW always translates into a call to OberonSys_SYSNEW while the standard procedure NEW can translate into a call to either, depending on the type of variable. Oberon_NEW requires two parameters: size (passed D0): this is only required for an ArrayBlk and is the size of the block to be allocated. tag (passed in D1): a pointer to a type descriptor. Bit 1 is set if an ArrayBlk is required. For a RecordBlk block, OberonSys_NEW calculates the size of the block by adding 8 to the size of the record found in the type descriptor. For an ArrayBlk block, it adds 20 to the size parameter. The result is rounded up to the next multiple of 4 bytes. It then asks Exec for a block of memory of the required size, using AllocMem (size, {memClear}) so that the block is zeroed. If successful, it initialises the block's hidden fields, links it into the relevant list and returns the address of the data field in D0. By default, the allocated block is placed in the list of blocks known to the garbage collector. If the allocation fails, NIL is returned in D0. OberonSys_SYSNEW requires two parameters: size (passed in D0): the number of bytes to be allocated. traced (passed in D1.B): a flag indicating if the variable is traced. OberonSys_SYSNEW adds 8 to the size parameter and rounds the result up to the next multiple of 4 bytes. It then asks Exec for a block of memory of the required size, using AllocMem (size, {memClear}) so that the block is zeroed. If successful, it initialises the block's hidden fields, links it into the relevant list and returns the address of the data field in D0. If traced is TRUE, the allocated block is placed in the list of blocks known to the garbage collector, otherwise it is placed in an alternate list and will not be garbage collected. If the allocation fails, NIL is returned in D0. Type tags and descriptors ------------------------- RecordBlk and ArrayBlk blocks both contain a type tag, which is a pointer to a type descriptor. A type descriptor is always associated with a record type. Information in the type descriptor is used by the allocator and the garbage collector. The structure of a type descriptor is: TypeTag = POINTER TO TypeDesc; TypeDesc = RECORD size : LONGINT; ttable : ARRAY 8 OF TypeTag; offsets : ARRAY N+1 OF LONGINT END size holds the size in bytes of an object of that type. This is used by the allocator to determine the size of a RecordBlk. ttable holds a table of pointers to the descriptors of the base types of the descriptor's type. This is used in type tests and guards. The size is fixed so that the garbage collector can always locate the offsets field at a fixed offset from the base of the type descriptor. This limits the depth to which types can be extended. The depth chosen (8) is the same as that used in the Oberon System and should be adequate for most purposes. The offsets field is an array holding the offsets of the pointer fields in the type; N is the number of such fields. This is used in the garbage collector's mark phase (see Garbage collection). Deallocating memory ------------------- Memory is explicitly deallocated with the standard procedure SYSTEM.DISPOSE. This is implemented in the run-time support procedure OberonSys_DISPOSE. OberonSys_DISPOSE has one parameter: block (passed in D0): the address of the block to be freed. OberonSys_DISPOSE scans the lists of allocated blocks looking for block; the untraced list is scanned first. If it does not find block, it halts the program. Otherwise, block is removed from the list and the memory is returned to the operating system using the Exec function FreeMem. SYSTEM.DISPOSE was originally intended to take the place of the garbage collector, which was not implemented until version 0.4 of the compiler. Wherever possible, the garbage collector should be relied on instead, especially if there is any chance that a block will be assigned to more than one variable. The use of SYSTEM.DISPOSE should be restricted to freeing untraced variables and variables that are strictly local to a single procedure. Garbage collection ------------------ Garbage is dynamically allocated memory that is no longer referenced by any variable. If it is not freed, it will cause memory leaks and fragmentation. On the other hand, memory freed while variables still reference it may cause bugs due to dangling pointers. The safest way to free garbage is by using a garbage collector, which traces all pointer variables to determine which memory is still being referenced. The original design of the Oberon language assumed that it would be operating in an environment which provided automatic garbage collection. As a result, the language contains a NEW procedure, but no DISPOSE. This assumption was later modified and implementators are now expected to provide a SYSTEM.DISPOSE procedure if they do not implement a garbage collector. Oberon-A now provides both. It implements the collector in the OberonSys_GC procedure in the run-time support code of each program. The garbage collector must be explicitly activated by the program calling the standard procedure SYSTEM.GC. The current implementation is not able to locate pointer variables on the stack; it can only trace global variables. This means that the garbage collector must *not* be activated at any point in the program where pointer variables local to an active procedure are referencing allocated memory. This will typically restrict it to the main body of the program, or the main event loop, where the programmer can guarantee that only global variables contain references to dynamically allocated memory. If a program does not call SYSTEM.GC, all dynamically allocated memory will remain allocated until the program ends. In many cases, especially in small utility programs, this will be preferable. The Oberon-A collector uses a mark-and-sweep algorithm. It first marks the allocated memory blocks that are reachable through global variables. It then sweeps through all the allocated blocks, freeing any that were not found in the mark phase. The algorithm used for the mark phase of the collector is described in the Oberon Technical Notes, part 5; see also part 2 for an earlier version of the algorithm. The sweep phase consists of scanning the list of traced blocks built by the allocator, freeing the unmarked blocks and unmarking the marked ones. The garbage collector must be able to find the global variables of each module in the program. To enable this, the compiler creates a data hunk for each module with global pointer variables called "_GC". This hunk has the following structure: GCHunkPtr = POINTER TO GCHunk; GCHunk = RECORD link : GCHunkPtr; varBase : ADDRESS; vars : ARRAY N+1 OF LONGINT END; (* GCHunk *) The hunks for all of a program's modules are inserted in a singly linked list through the link field in each hunk. The list is anchored in a global variable called OS_GCVars in the run-time support code. The initialisation code of each module ensures that the module's hunk is inserted in the list. The varBase points to the base of the module's variables. The vars field is a list of offsets from varBase, one for each pointer variable. The end of the list is marked by a negative offset. The mark phase of the collector starts at the hunk pointed to by OS_GCVars and follows the pointers in the link fields of all the hunks in the list, marking the variables listed in each hunk. CPointer and BPointer variables are not traced by the garbage collector. A RecordBlk or ArrayBlk block may itself contain pointers to other blocks. The collector must mark these variables as well. To do this it uses the information in the type descriptor pointed to by the type tag in each block. See Type tags and descriptors. A block is marked by setting a bit in the block's type tag. This presents a problem, since the available low-order bits are already used to encode the block type. The most-significant bit is used instead: when it is set, the block is marked. This bit was chosen on the assumption that it is extremely unlikely to be used in a valid address for a type descriptor. This is a fairly safe assumption, as few Amigas have more than 2GB of memory installed, but in theory this might lead to problems. As a belt-and-braces concession to safety, the memory allocator checks this bit in any type tag passed to it and halts the program if it is set.